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SIMULATING  DISCOUNTED  COSTS 
by 
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and 

Peter  W.  Glynn3 

ABSTRACT 

We  numerically  estimate,  via  Monte  Carlo  simulation,  the  expected  infinite-horizon 
discounted  cost  d  of  running  a  stochastic  system;  we  permit  the  discount  rate  to  be  state 
dependent.  By  exploiting  various  types  of  stochastic  structure,  we  beat  the  naive  strategy 
of  estimating  a  finite-horizon  approximation  to  d.  Efficient  estimators  are  obtained  for 
systems  which  are  semi-Markov  and/or  regenerative.  These  estimators  are  then  ranked 
with  respect  to  asymptotic  variance  as  a  function  of  computer-time  budget  and  discount 
rate. 
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INTRODUCTION 

)  In  many  settings,  discounted  costs  arise  naturally.  This  paper  describes  simulation 
methodologies  for  estimation  of  expected  discounted  costs  associated  with  systems  that 
exhibit  stochastic  fluctuations.  Such  techniques  are  important  for  numerical  computation 
of  discounted  costs  for  stochastic  processes  in  which  conventional  numerical  methods  either 
fail  to  apply  or  are  inefficient.  Examples  of  such  processes  include  non-Markov  processes  or 
infinite  state  space  Markov  chains.  The  discussion  given  here  of  simulation  algorithms  for 
the  discounted  cost  problem  also  merits  interest  to  the  extent  that  it  provides  an  excellent 
vehicle  for  illustrating  several  sophisticated  “variance  reduction*  methods  for  stochastic 
simulation.  These  techniques  are  more  accurately  called  efficiency  increase  techniques,  as 
we  shall  see. 

To  formulate  the  estimation  problem  mathematically,  we  let  X  =  (A'(t)  :  t  >  0}  be  a 
stochastic  process  taking  values  in  a  state  space  5.  Suppose  that  f,g  are  two  real-valued 
functions  defined  on  5,  in  which  /(x)  represents  the  cost  of  running  X  in  state  x  and  g(x) 
corresponds  to  the  (positive)  discount  rate  in  state  x.  Then 

0=  H  txp(-V(,))f[X[s))d»  (l) 

Jo 

is  the  infinite-horizon  discounted  cost,  where  V(a)  =  /„*  jpf(u))du.  Our  goal  in  this  paper  is 
to  construct  Monte  Carlo  simulation  algorithms  for  numerically  evaluating  d  =  ED. 

We  now  describe  the  layout  of  the  rest  of  this  paper.  Section  2  develops  a  naive  estima¬ 
tor  for  d  based  on  truncation  of  the  infinite-horizon  integral,  and  studies  its  relevant  theory. 
In  Section  3,  an  estimator  based  on  randomizing  the  truncation  point  is  developed,  and  it 
is  shown  that  for  large  computational  budgets,  this  estimator  beats  the  naive  truncation 
estimator  of  Section  2.  Section  4  shows  how  to  exploit  semi-Markov  process  structure  to 
improve  the  efficiency  of  the  randomized  estimator  of  Section  3  by  “conditioning  out”  the 
holding  times.  In  Section  5,  an  estimator  which  makes  use  of  regenerative  structure  is 
explored,  whereas  Section  6  studies  an  estimator  which  utilizes  both  semi-Markov  and  re¬ 
generative  structure  to  obtain  efficiency.  Finally,  in  Section  7,  we  compare  the  asymptotic 
variances  of  the  above  five  estimators  when  the  discount  rate  is  small;  a  small  discount  rate 
is  natural  in  many  economic  settings.  General  conclusions  on  the  choice  of  estimator  can 
also  be  found  there.  Unless  otherwise  indicated,  all  proofs  are  deferred  to  the  Appendix 
to  make  the  paper,  easier  tp  read. 

The  table  below  summarizes  features  of  the  five  estimators  that  we  consider. 
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*1 

62 

*4 

Ss 

truncates 

Yes 

No 

No 

No 

No 

“conditions 
out”  hold¬ 
ing  times 

No 

No 

Yes 

No 

Yes 

uses 

regenerative 

structure 

No 

No 

No 

Yes 

Yes 

becomes  less 
efficient  as 
discount  rate 
decreases 

No 

Yes 

Yes 

No 

No 

variance 

per 

run 

see 

below 

Si  beats  &7 

Si  beats  64 

overall 

efficiency 

always  loses 
for  large 
enough 
computer 
budgets 

Si,  Si  always 
lose  to  64,  Si 
for  small 
enough 
discount  rate 

depends  on 
expected  work  per 
run  as  well 
as  variance 
per  run; 
see  text 

Efficiency  of  a  simulation  algorithm  depends  on  both  the  time  required  to  generate 
observations  and  their  variance.  In  the  setting  of  continuous-time  Markov  chains,  S3  and 
are  more  cheaply  simulated  than  and  <S«;  since  they  also  reduce  variance,  they  are  clear 
winners  for  this  class  of  processes.  For  small  discount  rates,  Si  is  always  more  efficient  than 
$»;  for  moderate  to  large  rates,  no  universal  conclusion  is  possible.  If  the  process  simulated 
is  non-regenerative,  the  discount  rate  is  small,  and  the  computer-time  budget  is  modest, 
then  Sx  probably  beats  the  other  estimates.  Our  mathematical  analysis  in  the  following 
sections  supports  these  assertions. 
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2.  A  TRUNCATION  ALGORITHM 

Naive  Monte  Carlo  simulation,  based  directly  on  (1),  is  impractical  since  generating 
the  r.v.  D  generally  requires  an  infinite  amount  of  computation  time.  A  straightforward 
alternative  truncates  the  r.v.  D  at  some  finite  time  horizon  0,  yielding  the  quantity 


D(P)  =  /'«xp(-V(.))/(X(,))d«. 
Jo 


Given  a  computational  budget  t,  it  is  clear  that  the  truncation  point  0  should  increase 
with  t.  (Observe  that  a  sample  mean  estimator  based  on  D(0)  with  0  fixed  converges  to 
ED(0),  which  is  in  general  not  equal  to  d.) 

As  a  consequence,  we  need  to  define  a  sampling  plan  {0(t)  :  t  >  o},  in  which  /9(t) 
corresponds  to  the  truncation  point  associated  with  computer  budget  t.  Assuming  that 
the  time  required  to  generate  a  replicate  of  D(0)  is  ci0  (ci  >  0),  we  find  that  the  number 
n(t)  of  runs  completed  with  budget  t  is  [t(cip{t)\.  Given  t,  our  estimator  for  d  will  then  be 


s  M  + wm) +■■■+ Dnit)m)y,  n(o  >  i 

U  \0;  n(t)  m  0 


(2) 


where  (A»(  ) :  »  >  1}  is  a  sequence  of  i.i.d.  replicates  of  £>(  ). 

We  now  investigate  the  choice  of  sampling  plan  which  optimizes  the  behavior  of  the  es¬ 
timator  $i(t).  Given  the  exponential  character  of  discounting,  it  seems  reasonable  to  expect 
that  the  bias  of  5i(t)  behaves  like  aexp(-c/?(t))  for  some  constants  o,c.  This  expectation  can 
be  justified  when  X  is  a  regenerative  stochastic  process;  see  Glynn  and  Whitt  (1987).  In 
any  case,  it  then  follows  that  the  mean  square  error  (MSE)  of  Sx  (t)  is  given  approximately 
by 

—  d)2  *»  Cx(wr  D)^^-  +  a3exp(— 2c0(t)) 

The  choice  of  P(t)  which  minimizes  the  above  MSE  expression  is 


0*(t)  =  A;  +  A;iogt 

where  A5  =  -  k>g(ci  y*r  D/2a3c)/2e,  AJ  =  l/2c. 

Theorem  1  below  shows  that  the  above  approximations  can  be  justified  rigorously; 
its  proof  relies  heavily  on  the  general  theory  of  replication  estimators  of  the  form  (2), 
as  described  in  Fox  and  Glynn  (1947).  The  following  (reasonable)  assumptions  will  be 
needed: 

HI.  f,g  are  strictly  positive  functions  on  S. 


H2.  0  <  var  D  <  oo 

H3.  b(P)  =  d- ED(0 )  ~  ae~eP  as  p  -*  oo  for  some  constants  o,e. 

We  require  that  /  be  strictly  positive  merely  to  simplify  the  technical  statements  of  the 
theorems  presented  in  this  paper.  It  is  not  necessary  and  can  be  replaced  by  suitable 
(cumbersome)  absolute  integrability  hypotheses  on  /. 

Theorem  1.  Assume  H1-H3  and  suppose  p*  is  defined  by  (3).  Then: 

i)  If  p{t)  =  p*(t),  then 

ii)  If  p{t)/P’{t)  -►  k  >  1  as  t  -*  oo,  then 

iii)  If  p[t)/pm{t)  -*K<last-+oo,  then 

Note  that  part  iii)  forces  the  selection  p(t)  >  P‘(t)  for  large  enough  t.  On  the  other 
hand,  if  the  constant  k  appearing  in  ii)  is  strictly  greater  than  one,  the  variance  of  the 
limiting  normal  r.v.  is  greater  than  that  obtained  when  p  -  P'.  We  conclude  that  Theorem 
1  shows  that  the  asymptotically  optimal  choice  of  sampling  plan  is  p  =  p‘. 

Implementing  this  choice  requires  determining  a  and  e.  (A  glance  at  the  proof  of 
Theorem  1  shows  that  in  fact  e  is  the  crucial  parameter,  in  the  sense  that  if  p(t)  =  logt/2c+$, 
then  convergence  result  i)  always  ensues,  regardless  of  the  choice  of  f).  Theorem  1  indicates 
that  if  one  is  to  guess  a  choice  for  c,  it  is  better  to  underestimate  c  than  overestimate  it. 
In  particular,  suppose  that  one  uses  P(t)  =  log  t/2c'  +  f  with  c'  <  e.  Then,  fii(t)  will  satisfy 
relation  ii)  with  *  =  e/c'\  on  the  other  hand,  if  c'  >  c ,Si{t)  has  the  poor  convergence  structure 
associated  with  iii).  An  underestimate  e'  of  e  is  always  available  when  A  =  mf{j(x) :  x  e  5}  >  o, 
namely  c'  =  A. 

Theorem  1  also  shows  that  even  if  p  is  chosen  optimally,  the  best  possible  rate  of 
convergence  is  0{\/log  t/t).  This  is  unsatisfactory  in  comparison  to  the  canonical  rate  of 
0( l/Vt)  typical  of  Monte  Carlo  simulation.  Thus,  the  straightforward  truncation  approach 
of  this  section  appears  inefficient  for  large  computational  budgets  t,  and  the  investigation 
of  alternative  algorithms  is  warranted.  Heuristic  adjustments  to  p‘(t)  may  be  appropriate 
when  the  computer-time  budget  is  only  moderate. . 
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3.  A  RANDOMIZED  ESTIMATOR 


A  principal  difficulty  in  estimating  d  is  that  the  naive  Monte  Carlo  estimator  based  on 
replicates  of  0  is  inadmissible:  it  requires  infinite  time  to  simulate  a  single  observation  of 
0.  Hence,  it  is  clear  that  one  must  carefully  consider  the  computational  effort  required 
per  observation  in  order  to  properly  assess  the  efficiency  of  an  estimator.  Hammersley  and 
Handscomb  (1964)  proposed  evaluating  the  efficiency  of  a  Monte  Carlo  procedure  via  the 
formula 


Efficiency  =  (Time)-1  •  (Variance)'  (3) 

where  Time  =  expected  computation  time  per  observation 
Variance  =  variance  per  observation. 

Glynn  and  Whitt  (1986)  rigorously  justify  this  criterion.  Thus,  the  efficiency  of  an  esti¬ 
mator  may  be  improved  by  reducing  computation  time  per  observation  and/or  reducing 
variance.  An  important  implication  of  this  observation  is  that  the  efficiency  of  an  estimator 
may  be  improved  by  increasing  the  variance  per  observation  provided  that  the  computa¬ 
tional  time  required  per  observation  is  appropriately  decreased.  The  estimator  proposed  in 
this  section  has  precisely  this  property.  Specifically,  the  variance  per  observation  is  greater 
than  that  of  0,  but  the  observations  can  be  generated  in  finite  time. 

Suppose  R  is  an  exponential  r.v.  with  mean  one,  which  is  independent  of  X.  Set 

0(1)  =  /  f(X(t))dt 
Jo 

where  V“l(  )  =  inf{t  >  o  :  V (t)  >  ■};  we  call  0(1)  a  randomized  estimator  since  it  involves 
adding  additional  randomness  to  the  probability  space.  Note  that  0(1)  requires  simulating 
X  only  up  to  time  V-l(fi),  and  can  therefore  be  generated  in  finite  time  if: 

H4.  V(oo)  s  /0°°  g(X(t))dt  =  oo  a.». 

When  g  =  a,  then  EV~l(R)  =  i/a  and  varV-,(f?)  =  i/a3.  Thus,  both  the  expected  work  per 
run  and  the  variance  per  run  are  (generally  significantly)  affected  by  a. 

The  following  proposition  shows  that  efficient  estimation  of  d  can  be  based  on  0(1), 
since  Eb(  1)  =  d. 

Proposition  1.  Assume  Hi,  H2,  H4.  Then, 

0  *  0{£(1)|X},  so  that  Eb(  1)  =  d. 
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Standard  properties  of  conditional  expectation  guarantee  that  v» tD  <  varD(i),  so  that 
the  variance  per  observation  is  increased  by  using  an  estimator  based  on  £>(  1).  To  analyze 
the  efficiency  of  17(1),  we  will  obtain  a  central  limit  theorem  (CLT)  for  the  corresponding 
estimator.  Let  {(^(lJ.V"1  (/Z»))  :  n  >  1}  be  a  sequence  of  i.i.d.  copies  of  (Z7(l),  Vl(fl)). 
Given  t  units  of  computation  time,  the  number  of  observations  generated  is 

JVi(t)  =  mw{»  >  0  :  «i(Vi"l(Jli)  +  +  V„-l(*«))  <  *} 

(disregarding  overhead  for  generating  the  Jfc’s)  and  the  estimator  6?(t)  available  after  t  units 
of  computational  effort  have  been  expended  is 

I  Wfrr^I(1)  +  "  +^M*>(1));  ^W-1  (4) 

'  lO;  N^t)  =  0. 

Theorem  2  shows  that  $i(t)  converges  at  rate  0(t-1/2);  it  can  also  be  used,  in  a  straight¬ 
forward  way,  to  obtain  confidence  intervals  for  d. 

Theorem  2.  Assume  Hi,  H4,  and  ED( l)a  <  oo.  Then, 

tl'3(Sa(t)  -d)=>  {av3sD{  1)  EV^r))1/3^,:!) 

Furthermore,  varD(l)  =  2/0°°  f*  E{*xp(-V{t))f(X(s))f(X{t))}d3dt  -  and  EV-l(R)  = 
/“EexpJ-Ftt))*. 

This  theorem  confirms  the  efficiency  criterion  specified  by  (3),  in  the  sense  that  the 
asymptotic  variance  of  the  limiting  normal  r.v.  is  precisely  the  reciprocal  of  the  efficiency 
given  by  (3).  In  the  next  three  sections,  we  will  describe  estimation  algorithms  that  will 
increase  the  efficiency  of  5j(t)  by  reducing  the  variance  of  D(l)  without  increasing  the 
average  amount  of  time  required  to  generate  an  observation;  the  improved  efficiency  will 
be  obtained  by  utilizing  special  stochastic  structure  in  X. 

4.  DISCRETE-TIME  CONVERSION  FOR  SEMI-MARKOV  PROCESSES 
In  this  section,  we  construct  an  efficient  estimator  for  d  which  exploits  semi-Markov 
process  (SMP)  type  structure;  the  idea  is  to  eliminate  some  of  the  variance  in  D( l)  by 
conditioning  on  the  embedded  discrete-time  process  which  describes  the  sequence  of  states 
visited  by  X.  This  “discrete-time  conversion"  is  similar  in  spirit  to  the  estimator  discussed 
in  Fox  and  Glynn  (1986)  for  estimation  of  steady-state  quantities  associated  with  SMP’s. 
It  “undoes”  some  of  the  variance  increase  due  to  randomization. 
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Specifically,  we  assume  in  this  section  the  existence  of  a  discrete-time  process  Y  = 

(y„  :  n  >  0}  taking  values  in  S  and  a  strictly  increasing  sequence  of  random  times  {Sn:n>  o|] 
such  that 

H5.  i)  X(t)  =  E“,0  YnI(Sn  <  t  <  Sn+l)  where  S0  =  0. 

ii)  {0n  :  n  >  0}  is  conditionally  independent  given  K,  where  fin  =  S„+t  -  S„. 

iii)  P{pn  e  dt\Y}  =  F(Yn,Yn+i,dt )  for  some  family  of  distributions  F  on  S  x  S. 

H5  generalizes  the  notion  of  SMP,  since  we  do  not  require  here  that  Y  be  a  Markov  chain. 

To  apply  discrete-time  conversion,  we  let  N(t)  =  ma x{n  >  0  :  Sn  <  t}  be  the  number  of 
transitions  of  X  by  time  t,  and  set  M  =  N[V~l(r)).  From  H5,  we  get 

.  /-s,+,  rv-‘(R) 

b(i  )=£/  /(*(«))*+/  f(x(t))dt 

Js,  Jsu 

=  £*  /( W  +  HYu){V-1  W  -  Su) 

0 

The  “discrete-time”  estimator  of  this  section  is  based  on  .0(2)  =  E{D(i)\Y,M},  that  is 

6(2)=X;1/(yj)E{ft|K,M} 

j-0  l5' 

nYu)E{V~^r)-Su\Y,M). 

Let  <p[ x,y,X)  be  the  Laplace  transform  of  the  distribution  F(*,y,dt)  defined  by 

<(>(x,y,X)=f  e~X<F(z,  y,dt). 

J  |0,oo) 

It  is  easy  to  show,  using  a  dominated  convergence  argument,  that  the  derivative  <p'( x,y,  A) 
with  respect  to  A  exists  for  all  positive  arguments  A.  Let  {(*?», <p'k)  k  >  0}  be  the  sequence 
defined  by 

Vk  =  p{Yk,Yk+ltg{Yk)) 

<p,k  =  V,(Y„,Yk+ug(Yk)). 

With  this  notation  in  hand,  the  next  proposition  calculates  the  conditional  expectations 
appearing  in  (5),  as  well  as  the  conditional  distribution  of  M  given  Y. 

Proposition  2.  Assume  Hi,  H4,  and  ED(l)  <  oo.  Then,  for  k  <  m 


P[M  =  m\Y)  =  J]  PAi-Pm) 
1-0 

E{fik\Y,M  =  m}  =  -<p'kf<pk 
E(K-|*)-S«|K,M}  - 
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As  a  consequence  of  Proposition  2,  we  find  that 


fiffl  -  -  E  /«)»;/»<  W t-ri0fer1  ,6) 

Formula  (6)  shows  that  we  get  6(2)  by  generating  y  up  to  time  Af ,  where  Af  is  generated  by 
using  the  conditional  distribution  given  in  Proposition  2.  The  following  algorithm  can  be 
used  to  produce  r.v.’s  with  the  distribution  of  6(2);  its  validity  follows  immediately  from 
(6),  noting  that  Af  is  generated  by  “inversion”. 

Algorithm  A: 

1.  Generate  a  random  variate  17,  uniform  on  (0, 1). 

2.  Generate  y0. 

3.  Set  m «—  0,  A  *—  t,r  —  0. 

Comment:  now  A  =  P{M  >  OjK). 

4.  Generate  Ym+l. 

5.  Set  A  «—  A ipm. 

Comment:  now  A  =  P{M  >  m  +  Ijy) 

6.  If  U  >  A,  then 

o  set  p  -  /(y~)li>ar>.?rl  -  r. 

Sommsgt:  now  m  =  m 

ii)  exit 

7.  Else, 

i)  set  r  <-  r  +  f{Ym)<p'J'Pm 

ii)  set  m  m  +  1 

iii)  go  to  step  4. 

An  estimator  A3(t)  based  on  a  sequence  {(6„(2),M„) :  n  >  1}  of  i.i.d.  replicates  of  6(2)  can  be 
constructed  analogously  to  $2(t)  (see  (4)).  The  estimator  $3(t)  so  defined  is  a  sample  mean 
of  Nj[t)  observations  of  6(2),  where  N2{t)  is  the  number  of  observations  generated  in  t  units 

of  computer  time.  To  a  first  approximation,  7V2(t)  =  max{n  >  0 :  c2(Afi  h - 1-  Mn)  <  t}  where 

c3  is  the  computer  time  required  to  increment  m  by  one  in  Algorithm  A.  (This  disregards 
the  set-up  time  to  generate  Af,  and  the  fact  that  the  effort  required  to  execute  steps  4  to 
7  of  Algorithm  A  depends  on  the  random  states  occupied  at  times  m  and  m  +  i.) 

The  following  CLT  describes  the  behavior  of  the  estimator  £j(0,  and  can  be  used  to 
construct  confidence  intervals  for  d. 
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Theorem  3.  Assume  HI,  H4,  and  ED(2)7  <  oo.  Then, 


t1/a(M0  -  <0  =►  (c3EM  •  var  5( 2))1/3  ■  N(0, 1) 


as  £  — *  oo. 

The  proof  of  this  result  follows  immediately  from  Section  5  of  Glynn  and  Whitt  (1986). 
Since  5( 2)  =  E{D(l)\Y,M},  it  follows  by  the  principle  of  conditional  Monte  Carlo  (see  Brat- 
ley,  Fox,  and  Schrage  (1983))  that  var 5(2)  <  var  5(1).  Thus,  the  estimator  $3(t)  is  obtained 
from  62(t)  by  reducing  the  variance  per  observation.  However,  as  Theorems  2  and  3  point 
out,  an  efficiency  increase  is  obtained  only  if  (c2£M)/(c1£V_1(iJ))  <  var  5(1)/ var  5(2). 

To  fully  understand  this  condition,  note  that  var5(l)/var5(2)  reflects  the  degree  to 
which  randomness  in  5(1)  is  due  to  the  holding  times  as  opposed  to  the  embedded 
sequence  Y.  On  the  other  hand,  the  ratio  c2EM/ciEV~1(R)  describes  the  complexity  of 
generating  a  5(2)  observation  relative  to  a  5(1)  variate.  Observe  that  both  types  of  obser¬ 
vations  require  generating  Y  up  to  time  M\  the  difference  is  that  5(1)  additionally  requires 
generating  the  holding  times  while  5(2)  involves  the  Laplace  transform  quantities  <pj 
and  <p'r  If  the  F(x,y,dt)'s  are  distributions  having  Laplace  transforms  that  are  easily  nu¬ 
merically  evaluated  (as  is  the  case  with  gamma  r.v.’s,  for  example),  then  the  (possible) 
increase  in  effort  involved  in  passing  from  5(1)  to  5(2)  should  be  modest;  in  these  circum¬ 
stances,  63(t)  is  more  efficient  than  5a(t).  For  a  more  detailed  comparison  of  “discrete-time” 
estimators  with  their  “continuous-time”  analogs,  see  Section  2  of  Fox  and  Glynn  (1986). 

5.  ESTIMATION  FOR  REGENERATIVE  PROCESSES 

We  assume  now  that  X  is  a  (possibly)  delayed  regenerative  process  with  regeneration 
times  0  <  To  <  Tt  <  . . .  (If  X  is  non-delayed,  set  T0  =  0.);  thus,  we  do  not  require  in  this 
section  that  X  satisfy  the  semi-Markov  hypothesis  H5.  The  independence  of  regenerative 
cycles  implies  that 

d  =  EA  (0)  +  EC{  0)  EK  (0)  (7) 

where 

A(i)  =  £  exp  J  ffWIU  +  •))*)  f(X(Ti-i  +  t))dt 

C(»)  =  exp  (-  jT  S(X(7y_,  +  t))dt^  .  .... 

*(.)  =  jH  exp  (-  J*  giXpi-t  +  s))ds^j  f[X(T,  +  t))dt 
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and  u  =  Ti  -  T<_ x.  A  similar  analysis  of  EK( 0)  shows  that 


EK{0)  =  EA{  1)  +  EC(1)EK{1). 

But  A(l)  has  the  same  distribution  as  A(0)  by  the  regenerative  property,  so  EK( o)  =  EK{1). 
We  conclude  that  EK(0)  =  EA(  1)  (l  -  EC(l))-1.  Substituting  into  (7)  yields 

d  =  EA(0)  +  EC(0)EA[1)  •  (1  -  £C(1))-1.  (8) 

Equation  (8)  suggests  that  d  can  be  estimated  by  simulating  regenerative  cycles.  Since 
each  regenerative  cycle  can  be  generated  in  finite  time,  independently  of  g,  we  will  avoid 
the  problems  inherent  in  trying  to  generate  D  explicitly,  or,  when  the  discount  rate  is  small, 
in  randomizing  as  in  Sections  3  and  4.  (See  also  Section  7.)  In  the  discounting  context,  it 
is  important  to  allow  the  possibility  that  X  is  a  delayed  regenerative  process  (as  opposed 
to  steady-state  simulation).  For  example,  if  one  is  asked  to  compute  the  discounted  cost 
for  a  Markov  chain  initiated  with  a  distribution  concentrated  on  more  than  one  point,  this 
generalization  would  be  required. 

Since  (8)  involves  two  different  types  of  cycles  (delayed  and  non-delayed) ,  it  offers  the 
possibility  to  stratify  the  computation  effort  so  as  to  maximize  the  efficiency  of  the  resulting 
estimators.  Given  a  computational  budget  t,  we  allocate  a  proportion  p  to  generating  pairs 
(C(o),  A(0))  and  a  proportion  q  =  1-p  to  simulating  the  pairs  (C(l),  A(l))  from  the  non-delayed 
cycle.  An  estimator  $4(t)  is  then  obtained  by  substituting  the  resulting  sample  means  in 
(8). 

To  be  precise,  let  {(<?„(»),  An(t),  rni) :  n  >  1  (t  =  0, 1)  be  two  independent  sequences  of  i.i.d. 
random  vectors  where  (Cn(t),An(t))  shares  the  same  distribution  as  (C(»),  >4(x)) ,  and  where 
rni  represents  the  length  of  the  corresponding  cycle  used  to  obtain  (Cn( i),A„(i)).  Thus,  if 

we  set  po  =  p,p i  =  q,  then  N'[t)  =  max{n  >  0  :  ci(tu  + - h  r„,)  <  p,t}  is  the  number  of  type  i 

cycles  completed  by  time  t.  Put 

(<?,(*)  At(i))  —  |  + - (CATMqWi^AfMtjW))  >  JV*(£)  -  1 

Then,  the  estimator  S4(t)  is  given  by 


S4(t)  =  At(0)  +  Ct(0)At(l)  (1  -  <7t(l))~‘ 

To  analyze  the  behavior  of  this  estimator,  we  derive  a  CLT  for  54  (t).  (Again,  this  can  also 
be  used  to  produce  confidence  intervals  for  d.)  We  require  that: 
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H6.  Eti  <oo  (*'  *  0, 1). 

Theorem  4.  Assume  Hi,  H2,  and  H6.  Then,  for  0  <  p  <  l, 

t1/2(M*)  -  <0  =*  K3/p  +  (0,  l) 

as  t  -» oo,  where 

00  =  ex  var(j4(0)  +  C(0)  •  £A(1))  •  Er, o 

=  ci  +  C{1) '  EK(l))  ‘  Eri- 

To  optimize  the  performance  of  $4(t),  we  select  p  to  minimize  the  asymptotic  variance 
term  o%jp  +  a\/q.  It  is  easily  verified  that  the  minimizer  is  given  by 

p*  =  ?o(?o  +  ^i)-1 


(provided  <r0  +  o\  >  0)  where  a\  =  s/of,  in  which  case  the  corresponding  variance  is  (o0  + 
<7i)2.  To  compare  the  efficiency  of  the  estimator  with  the  previous  ones,  in  particular 
the  truncation  estimator,  it  is  useful  to  relate  the  coefficients  defining  o£  and  o\  to  var  D 
appearing  in  Theorem  1. 

Proposition  3.  Assume  Hi  and  H2.  Then 

var  17  =  var(,l(0)  +  <7(0)  EK(  1))  +  ^  f  ^)a  var(/t(l)  +  C(l)  (1))) 

To  aid  in  comparison,  note  that  £C(0)3  >  (£C(0))2  and  E( l  -  C(i)2)  >  (l  -  EC( l))2.  (For  the 
second  inequality,  0  <  C(l)  <  l  so  EC(  1)  >  EC{ l)2.  Hence  1  -  EC( l)  <  l  -  EC( l)3.  But  since 
0  <  EC(\)  <1,  (1  -  J£C(l))2  <  l  -  EC{1).)  In  the  non-delayed  case  where  C(0)  =  l,  A(o)  =  o,  we 
choose  p=  l  (obviously).  Theorems  1  and  4  then  suggest  that 


var$i(t)/v»rMO  ~  —  - — —  . 

11  J/  41  ;  2bErx  (1  -  £C(1)2) 

We  conclude  that  if  t  »  exp(26Erl),  it  is  better  to  use  £4(t). 
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6.  ESTIMATION  FOR  REGENERATIVE  SEMI-MARKOV  PROCESSES 
In  this  section,  we  illustrate  how  the  methods  of  Section  4  and  5  can  be  combined  to 
obtain  an  estimator  65(t)  which  exhibits  the  best  features  of  S3(t)  and  <S4 (t) .  In  particular, 
6s(t)  exploits  the  regenerative  structure  of  X  while  “filtering  out”  the  variance  in  S4(t)  due  to 
holding  time  randomness;  the  latter  property  is  achieved  by  using  discrete-time  conversion. 

Returning  to  the  set-up  of  Section  3,  we  now  assume  that  the  embedded  sequence  Y  is 
regenerative.  Thus,  we  require  that  Y  possess  regeneration  times  0<Uo<Ux<  and  set 


U-i  =  0,  r)i  =  Vi  -  By  the  conditional  independence  of  the  0/s  given  Y ,  it  follows  that 
the  random  times  T<  =  are  regeneration  times  for  X.  Hence,  (8)  is  valid  for  the  T/ s;  as 
an  immediate  consequence,  we  obtain  the  identity 

d  =  EA(0)  +  EC(0)EA(  1)  (1  -  £C(1))-1  (10) 


where  A(t)  =  £{A(*)|y},  C(i)  =  JS{C(t)|K}.  To  compute  the  conditional  expectations  appear¬ 
ing  in  (10),  observe  that 


wwin 


=sj  n1  «*p(-s(w)iyj 


■  II 


and 


£{A(*)|y}  =  e  |  if  j*"1  «p  (-  j f  g(x{Ti- 1  +  .))*)  /(y,)|y  j 


(u.-i  y-i  ,p,  ) 

n  n  «*-•<*>*>/  exP(-ff(yy)o«it/(y,)iy  y 

fc»0  1/0 


u(-i  y-i 

*  n  n«*-w> 

jat/i-j  fc=0 


Ml 

M)' 


Given  the  above  formulas,  it  is  straightforward  to  generate  the  pairs  (C(t),A(t))  by  simu¬ 
lating  the  sequence  Y.  As  in  Section  4,  the  computational  effort  may  be  assigned  so  that 
a  fraction  p<  of  the  total  time  t  is  delegated  to  generation  of  pairs  (C(»),  A(t')),  (»'  =  0, 1).  An 
estimator  68(t)  can  then  be  constructed  analogously  to  £«(t). 

We  can  derive  a  CLT  for  $6(0  which  describes  its  convergence  and  can  be  used  for 
confidence  interval  estimation;  the  proof  is  analogous  to  that  of  Theorem  4  and  is  therefore 
omitted.  The  result  (Theorem  5  below)  assumes  that  the  computational  effort  required  to 
generate  (C(t'),A(*))  is  c3«k.  The  constant  c3  reflects  the  difficulty  of  simulating  the  chain  Y 
and  numerically  evaluating  the  p/s  (we  do  not  assume  that  c2  =  e3  since  the  discrete-time 
algorithm  of  Section  4  also  involves  numerical  evaluation  of  the  derivatives  of  the  p/s, 
which  may  be  harder.  For  continuous-time  Markov  chains,  however,  both  p,  and  p'}  have 
simple  closed  forms.) 

H7.  Erji  <00  (*  =  0, 1). 

Theorem  5.  Assume  Hi,  H2,  and  H7.  Then,  for  0  <  p  <  1, 


*l/3(«a(0  ~  d)  *  [*l/p  +  9l/qy/7N( 0, 1) 
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as  t  -» oo,  where 


b%  =  c3  •  vv(E{A(0)  +  £7(0)  •  £tf(l)|y»  •  Et, o 

5?  =  *  (i.^il))2^^1)  +  C{1) ' ' Eni- 

The  principle  of  conditional  Monte  Carlo  again  guarantees  that  5%  <  a%,  b\  <  a\\  the 
amount,  of  variance  reduction  depends  on  the  extent  to  which  the  randomness  of  D  is  due 
to  the  holding  times.  As  an  immediate  consequence,  we  find  that  if  ci En  «  c3Ei 71,  5S (t) 
is  more  efficient  than  £«(t).  The  following  proposition  relates  5%  and  b\  to  var(E{D|y});  its 
proof  is  similar  to  Proposition  3  and  is  omitted. 

Proposition  4.  Assume  HI  and  H2.  Then, 

var(£{D|y})  =  var(i(0)  +  6(0)  ■  EK(  1))  +  var(i(l)  +  0(1)  ■  £A(1)). 

This  result  can  be  used  to  compare  the  efficiency  of  Si  (t)  to  S6(t)  when  Y  is  non-delayed. 
By  arguing  as  in  (9),  we  find  that 

varSi(t)  log  t  ■  ci  var  D  ■  (1  —  ■EC'(l))2 

var  S5(t)  ~  2bEm  c3  ■  var(£{Z7|y })  (1  -  EC( l)2) ' 

Here,  we  find  that  if  t »  exp(2bc3Ern  v*r(E{D\Y})/(vaxD  ■  ci)),  S5(t)  is  more  efficient  then  5i(t). 

7.  ANALYSIS  OF  EFFICIENCY  FOR  SMALL  DISCOUNT  RATES 

In  this  section,  we  study  the  relative  efficiencies  for  small  discount  rates  of  the  methods 
considered  above.  Smallish  discount  rates  arise  naturally  in  many  economic  contexts  (e.g. 
low  inflation  rate  settings)  and  a  considerable  literature  has  developed  on  this  topic.  (See, 
for  example,  Veinott  (1969)  or  Whitt  (1972).) 

To  make  our  analysis  precise,  let 

Va(t)  =  a  [' g(X(s))d3 
Jo 

where  g  does  not  depend  on  a  and  set  d(a)  =  EDa ,  where 

Da=  r  oxP(-Va(t))f(X(t))dt. 

Jo 

We  are  interested  in  the  efficiency  of  our  five  estimators  for  D(a)  when  a  is  small.  Given 
Theorems  1  through  5,  we  examine  the  asymptotic  behavior  of  the  scaling  constants  ap¬ 
pearing  in  front  of  the  limiting  normal  r.v.  These  scaling  constants  determine  the  width 
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of  the  confidence  interval  associated  with  a  given  method,  and  consequently  one  wishes  to 
choose  estimators  for  which  the  scaling  constants  are  as  small  as  possible. 

Our  subsequent  mathematical  analysis  requires: 

H8.  X  is  a  (possibly)  delayed  regenerative  process  with  regeneration  times 
0<  T0  <  Tx  <  ... 

H9.  E(Yi{f)*  +  Yi{g)A)  <  OO  (»  =  0, 1),  where 

YiU)  =  [T>  f(X(s))dx 
JTi-i 

K(ff)  =  r~l  a(X(s))dx 
JTi-i 

and  T_i  =  0. 

Although  the  results  stated  here  require  the  regenerative  structure  for  the  proofs,  it  seems 
likely  that  the  same  asymptotic  behavior  holds  for  more  general  classes  of  processes.  This 
belief  is  supported  by  some  of  the  more  general  limit  theorems  appearing  in  Glynn  and 
Whitt  (1987). 

To  state  the  following  theorem,  we  add  an  a-dependence  to  all  the  r.v.’s  and  constants 
appearing  in  Theorems  1  to  5.  For  example,  Da(l)  is  defined  as 

/  f(X(t))dt. 

Jo 

Theorem  6.  Assume  H1-H8.  Then: 

(a)  <f(a)  ~ 

(b)  var  Da  ~ 

(c)  c(a)  ~  a  r(g) 

(d)  var£>a(l)  ~ 

(e)  EVa~'(R)  ~  (ar(g))"> 

(f)  var  A,  (2)  ~ 

(g)  EM(a)~(ar(g))-'§ft 

(h)  (cr0 (c*)  +<ri(a))2  ~  y 

(i)  (50(a)  +5i(a))2  ~ 

as  a  1  0,  where  r(/)  =  EY^/Er^rlg)  =  EYi[g)/Eri, 


var(r(g)yt(/)  -r(/)r!(g)) 


Er\ 


Eri 
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Given  a  computation  budget  of  (at  least)  moderate  size  t,  the  above  theorem  tells  us 
that  if  the  discount  rate  is  small,  then  we  can  expect  that 

.  , .  log*  Ci  <73 

»««,(<)-— 

varA,M  1  Cl  r*( / ) 

v  r  w  1  £l  **{/)  Ern 

Var5s(t)w7  'a3  '  r®(y)  '  Etj. 

var6A{t)  »  i  • 

t  a3  r(g ) 

x  1  e3  a3  Ern 

1  '  t  a3  r($) 

Assuming  that  e}Erh  <  cxEti  for  j  =  2,3  (i.e.  the  cost  of  simulating  a  regenerative  cycle  in 
discrete  time  is  less  than  or  equal  to  the  cost  of  simulating  a  cycle  in  continuous  time), 
the  above  analysis  suggests  that  we  can  order  (for  small  discount  rates)  the  estimators  in 
order  of  decreasing  preference  as  follows:  MO.MO.MO.MO.MO- 

For  larger  discount  rates,  we  would  expect  that  the  logt  term  in  var$i(t)  would  dominate 
the  additional  factor  of  l/a  appearing  in  var$2(t)  and  var<S3(t).  Thus,  for  larger  discount  rates, 
we  recommend  using  the  estimators  in  the  order  65  (0 » MO  >  MO.  MO  .MO-  This  is  a  heuristic 
analysis,  however,  and  for  a  given  budget  t,  there  may  be  some  realignment  in  this  order. 

The  above  results  also  show  that  the  discounting  problem  does  not  get  harder  as 
a  |  0,  provided  that  we  take  advantage  of  regenerative  structure.  Suppose  that  we  wish 
to  construct  a  100(1  -  S)%  confidence  interval  for  d(a)  with  half-width  equal  to  e%  of  d(a). 
If  the  estimator  d6(t )  is  used,  the  computational  effort  t(a)  required  for  this  relative  width 
confidence  interval  is  given  approximately  by 


t(a)  « 


*3(6)&3c3  ■  Erhr(g) 
e3Eri  r(/)3 


where  z(6)  solves  P{N(0,  l)  <  «(£)}  =  1  -  S/2.  Since  the  right-hand  side  does  not  depend  on 
a,  this  shows  that  the  discounting  problem  does  not  get  harder,  in  a  relative  error  sense, 
as  the  discount  rate  is  driven  to  zero.  This  is  in  contrast  to  the  problem  of  estimating 
steady-state  queue-length  in  heavy-traffic,  where  the  relative  error  problem  does  get  harder 
as  the  traffic  intensity  increases  to  one  (see  Whitt  (1987)). 
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APPENDIX 

Proof  of  Theorem  1.  First,  observe  that  the  positivity  of  /  shows  that  the  bias  b(0)  is 
positive  for  all  0.  Furthermore,  by  H2,  it  is  evident  that  b[0)  converges  to  zero  as  0  oo, 
and  thus  a  and  b  must  be  positive  finite  constants. 

We  now  apply  the  results  of  Fox  and  Glynn  (1987)  to  obtain  the  theorem;  it  is  easily 


checked  that  their  hypotheses  are  in  force.  Their  Proposition  1  states  that  0{t)  ->  oo  as 
t  -*  oo  is  necessary  for  consistency  of  S^t),  while  their  Theorem  2  proves  that 

-<*)=►  +  7  (Ai) 

as  t  -»  oo,  where  q(t)  =  (t/ci0(t)  vxr  D(0{t)y/2,  and  7  =  lim*-.,*,  q(t)b(0(t)).  Since  0(t)  -*  oo,  it  is 
evident  that 

varD(0(t))/  var  0  — * 1  (-42) 

as  t  -»  oo.  Furthermore, 

•‘"w‘»  -  r  “>  [-<«*>  -  ■  <"> 


Hence,  if  /9(t)  =  0*(t),  it  follows  that  tl/*b(0(t))  converges  to  a  finite  constant,  so  that  7  =  0; 
part  i)  is  then  obtained  by  using  (Al)  and  (A2).  Similarly,  for  part  ii),  tlf3b(0[t))  -*  o  so 
that  7  =  0  and  the  result  again  follows  immediately  from  (Al)  and  (A2).  Finally  for  iii), 
(A3)  shows  that  for  t  sufficiently  large, 

|  exp  [j(l  -  *)logt]  = 

so  that  7  =  oo,  yielding  the  result. 

Proof  of  Proposition  1.  We  can  write  0(1)  as 

p(i)=  r  i[v-'(R)>t)nx(t))dt 

Jo 

=  r  i{R>v(t))f(x(t))dt. 

Jo 

The  result  then  follows  from  Lemma  1  below,  by  noting  that  the  independence  of  X  and 
K  proves  that  E{I(R  >  K(t))/(*(t))|A)  =  f(X(t))P{R  >  K(t)|X)  =  f(X(t))*xp(-V(t)). 

Lemma  1.  Let  Z  be  a  nonnegative  process  on  a  probability  space  (fl,  7,  P)  such  that 
E  /0°°  Z(t)dt  <  oo.  If  $  is  a  sub-o-field  of  T,  then 

E{  r  Z{t)dt\9)  =  r  E{Z(t)\9)dt  a.,. 

Jo  Jo 
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Proof  of  Lemma  1.  We  use  the  defining  relation  for  conditional  expectation,  as  given 
on  p.  298  of  Chung  (1974).  Note  that  /0°°  E{Z(t)\$)dt  is  a  ^-measurable  r.v.  such  that  if 

A  €  p,  ^  ^ 

E  QH  E{Z(t)\$}dt  /(A))  =  J~  E{I{A)E{Z(t)\S))dr 

=  r  E[I(A)Z{t))dt 

Jo 

-e(j~  zw-iwy, 

the  first  and  third  equalities  use  Fubini’s  theorem,  whereas  the  second  follows  from  the 
defining  relation  for  E{Z(t)\9).  We  have  therefore  demonstrated  that  /0°°  E{Z[t)\$}dt  satisfies 
the  defining  relation  for  E  {/0°°  Z(t)<h|5},  proving  the  result. 

Proof  of  Theorem  2.  The  CLT  for  5a(t)  follows  immediately  from  Section  5  of  Glynn 
and  Whitt  (1986).  For  the  expression  for  var£>(l),  note  that 

ED( l)2  =  2E I  ^  Jo  f(X(s))f(X(t))dsdt  J 

=  2 E  { jH  J*  I(R  >  V(t))/(X(3))/(X(t))^dt} 

=  fy  jf  E{I(R  >  V(t))f(X{a))f(X{t))dsdt). 

But  E{I(R  >  V[t))f(X(»))f(X(t))\X)  =  fW'MXit))  P{R  >  V(t)}  =  /(jr(.))/(JT(t)) exp(-K(t)), 
yielding  the  formula.  A  similar  proof  gives  the  expression  for  EV~1(R). 

Proof  of  Proposition  2.  For  the  first  formula,  note  that 

P{M  >  m|y)  =  >  sm|y} 

=  p{r  >  v(5m)|y> 

=  E{P(R  >  y(5m)|x}|y) 

=  £{exP(-y(sm))|y} 

=  n  s{exp(-,(y,)^)|y} 

y=o 

=  nV(KJ1KJ+llS(y))), 

o 

from  which  the  result  follows.  For  the  second  expression,  observe  that  E{0k\Y,  M  =  m}~ 

E{pkI(M  -  m)|y}/P(W  *  m|y}.  To  analyze  the  numerator,  note  that  for  Jfc  <  m, 

E{foI{M  >  m)|y}  -  E{E{fikI{R  >  y(Sm))|X}|y} 

=  ^*«p(-y(5m))|y) 

m—  1 

*  -p'h  n  Pi 
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so  that  E{0kI(\f  =  m)|y}  =  •  (1  -  <f>m)  ■  Tip1  Vi-  This,  when  combined  with  the  first 

formula,  yields  the  second  identity. 

The  proof  of  the  third  formula  follows  a  similar  pattern.  We  write 

-  Su\Y,  M  =  m}  =  £{(y-*(£)  -  Su)J(M  =  m)\Y}/P{M  =  m|y};  again,  we  handle  the 
denominator  using  the  first  formula.  For  the  numerator,  we  note  that  E{V~1(R)I(M  =  m)|y} 
can  be  expressed  as 

>t,M  =  m\Y}dt 

ao  oo  (44) 

=  [°°  P{R>V[ty,R>V{Sm)\Y}dt-  f°°  P{R>V(t)iR>V{Sm+l)\Y}dt 
Jo  Jo 

By  conditioning  on  X ,  we  find  that  P{J*  >  max(y(t),y(Sfc))|y)  =  E{exp(-max(y(t),y(Sfc))|y}. 
So  by  Lemma  1 

PiR  >  max(y(i),y(5fc))|y}^  =  E  m«(y(t),y(Sfc)))«ft  J 


=  E{Sk  exp(-y(5fc))|y}  +  E  Oexp(-y(t))*|y  } . 


On  the  other  hand,  E{SmI(M  =  m)|y}  =  0{SmJ(y(Sm)  <  R  <  y(Sm+1))|y}  = 

£{5m(exp(-y(5m)) -exp(-y(5m+i)))|y}.  Combining  this  with  (A4)  and  (A5)  shows  that  the 
numerator  equals 

0  j«cp(-y(Sm))  oxj>(—g(Ym)t)dt  -  0m  exp(-S(ym)y9m)  j  |yj 

Dividing  by  the  denominator  gives  the  third  formula. 

Proof  of  Theorem  4.  Standard  weak  convergence  arguments  prove  that 

Mt)  =  1-4, (0)  +  Ct[Q)EK[\))  +  [rf^(g(1)(^(l)  -  EK(l)[l  -  <?«(!)))]  +op(r1/2) 

where  op(t_1/a)  represents  a  process  x(t)  such  that  =>  0.  The  random  time-change 

results  of  Section  5  of  Glynn  and  Whitt  (1986)  can  now  be  applied  to  the  bracketed  terms 
above  to  obtain  the  result.  (To  show  that  the  respective  variances  are  finite,  see  the  proof 
of  Proposition  3.) 

Proof  of  Proposition  3.  The  regenerative  structure  of  X  proves  that  (=  denotes  equality 
in  distribution) 

0=4(0)  +  C(0)A(1) 


where  (4(0),  C(0))  is  independent  of  tf(l).  Squaring  both  sides  and  taking  expectations,  we 
get 


ED3  =  EA(0)3  +  2£4(0)C(0)£A(1)  +  EC{Q)3EK(l)3.  (46) 


Since  all  terms  on  the  right-hand  side  are  positive,  we  see  that  H2  implies  the  finiteness  of 
all  the  quantities  appearing  there.  We  apply  the  same  analysis  to  Jf(l): 


tf(l)£4(l)  +  C(l)*(2). 


Using  the  fact  that  K(2)^K(\),  we  get  EK{ l)2  =  (£4(1)2  +  2£4(1)C(1)  •  ^7A"(1))(1  -  EC{l)a)-1. 
(EC2( l)  <  1  by  HI).  Substituting  this  into  (A6)  yields  the  result,  after  algebraic  simplifica¬ 
tion. 

Proof  of  Theorem  6.  For  a)  we  use  (8)  and  let  a  1  0.  A  Taylor  expansion  gives 


4a(0)  =  fT°  exp(-aV(t))f(X(t))dt 
Jo 

=  J*  [l  -  aV(t)  +  ^V2(£)e-^<‘>]  f(X(t))dt 


where  7  =  ir(a)  e  (0,a).  Since  V2(t)exp(-i(a)V(t))f(X(t))  <  V2(r)f(X(t))  uniformly  in  a  and 

[T°  V3(r)HX(t))dt  =  Yo{f)Y0{g)3 
Jo 

is  integrable  by  H9,  it  follows  from  the  dominated  convergence  theorem  that 


£4.(0)  =  EYo(f)  -  ocE  {1*°  V(t)f(X(t))dt  J  +  YE{j*°  V3(t)fWt))dt  J  +  o(a2). 


(47) 


Similarly,  one  can  show  that 


ECa( 0)  =  1  -  aEY0(y)  +  y  EY0(y)3  +  o(a2). 

Corresponding  expressions  for  £4a(l)  and  ECa(  1)  lead  immediately  to  a).  For  b),  h),  i),  we 
use  Propositions  3  and  4  and  arguments  similar  to  the  above.  Relation  c)  can  be  found  in 
Glynn  and  Whitt  (1987). 

For  e),  observe  that 

^a_1W  =  ia(0)  +  /a(0)Qa(l)  (48) 


where  La( 0)  =  V-»(«)  a  ro,/a(0)  =  I{V~l{R)  >  T0),Qa(  1)  =  V~»(r)  -  T0,  and  (La(0),lo(0))  is 
independent  of  Q«,(l).  Furthermore,  on  {V~l(R)  >  T0}, 

Qo(l)=io(l)  +  /a(l)Qo(2)  (49) 
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where  X«(l)  =  (V"l(S)  a  Tx)  -  T0l/a(l)  =  I{V~l(R)  >  ri).<?«(2)  =  V;'(R)  -  Tu(La[ l),/.(l))  is 
independent  of  Q«(2),  and  the  distribution  of  Qa(i)  conditional  on  {Va-1(A)  >  2*}  is  indepen¬ 
dent  of  i  (*'  =  0, 1).  Taking  expectations  in  (A8)  and  (A9)  and  using  the  independence  leads 
to  an  expression  similar  to  (8).  One  then  expands  the  expectations  in  a  manner  similar 
to  (A7)  to  obtain  e).  Results  d),  f),  and  g)  are  proved  using  decompositions  analogous  to 
(A8)  and  (A9),  followed  by  Taylor  expansions  for  small  a. 
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